101 research outputs found
Tiled Multiplane Images for Practical 3D Photography
The task of synthesizing novel views from a single image has useful
applications in virtual reality and mobile computing, and a number of
approaches to the problem have been proposed in recent years. A Multiplane
Image (MPI) estimates the scene as a stack of RGBA layers, and can model
complex appearance effects, anti-alias depth errors and synthesize soft edges
better than methods that use textured meshes or layered depth images. Unlike
neural radiance fields, an MPI can be efficiently rendered on graphics
hardware. However, MPIs are highly redundant and require a large number of
depth layers to achieve plausible results. Based on the observation that the
depth complexity in local image regions is lower than that over the entire
image, we split an MPI into many small, tiled regions, each with only a few
depth planes. We call this representation a Tiled Multiplane Image (TMPI). We
propose a method for generating a TMPI with adaptive depth planes for
single-view 3D photography in the wild. Our synthesized results are comparable
to state-of-the-art single-view MPI methods while having lower computational
overhead.
Comment: ICCV 202
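An MPI as described above is rendered by alpha-compositing its RGBA planes from back to front with the standard "over" operator. A minimal sketch of that generic compositing step (not the paper's tiled renderer; the array layout and function name are our own):

```python
import numpy as np

def composite_mpi(layers):
    """Composite an MPI (a stack of RGBA planes) back to front.

    layers: (D, H, W, 4) array with straight (non-premultiplied) alpha,
            index 0 = farthest plane.
    Returns the rendered (H, W, 3) image.
    """
    out = np.zeros(layers.shape[1:3] + (3,))
    for rgba in layers:  # back to front
        rgb, alpha = rgba[..., :3], rgba[..., 3:4]
        out = rgb * alpha + out * (1.0 - alpha)  # "over" operator
    return out
```

A transparent near plane leaves far content untouched, while fractional alpha blends the planes; that blending is what lets an MPI synthesize soft edges and anti-alias depth errors.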
Temporally Consistent Online Depth Estimation Using Point-Based Fusion
Depth estimation is an important step in many computer vision problems such
as 3D reconstruction, novel view synthesis, and computational photography. Most
existing work focuses on depth estimation from single frames. When applied to
videos, the result lacks temporal consistency, showing flickering and swimming
artifacts. In this paper we aim to estimate temporally consistent depth maps of
video streams in an online setting. This is a difficult problem as future
frames are not available and the method must choose between enforcing
consistency and correcting errors from previous estimations. The presence of
dynamic objects further complicates the problem. We propose to address these
challenges by using a global point cloud that is dynamically updated each
frame, along with a learned fusion approach in image space. Our approach
encourages consistency while simultaneously allowing updates to handle errors
and dynamic objects. Qualitative and quantitative results show that our method
achieves state-of-the-art quality for consistent video depth estimation.
Comment: Supplementary video at
https://research.facebook.com/publications/temporally-consistent-online-depth-estimation-using-point-based-fusion
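The paper's fusion is learned; as a caricature of the trade-off it describes between enforcing consistency and correcting errors, here is a hand-crafted per-pixel rule. The confidence weighting and the `dyn_thresh` parameter are our assumptions, not the paper's method:

```python
import numpy as np

def fuse_depth(prev_depth, prev_conf, new_depth, new_conf, dyn_thresh=0.1):
    """Per-pixel fusion of a reprojected previous depth map with the
    current single-frame estimate.

    Pixels whose relative depth change exceeds dyn_thresh are treated as
    dynamic (or as previous errors) and reset to the new estimate;
    otherwise a confidence-weighted average enforces temporal consistency.
    """
    rel = np.abs(new_depth - prev_depth) / np.maximum(prev_depth, 1e-6)
    w_prev = np.where(rel > dyn_thresh, 0.0, prev_conf)
    fused = (w_prev * prev_depth + new_conf * new_depth) / (w_prev + new_conf)
    conf = w_prev + new_conf
    return fused, conf
```

Static pixels are smoothed toward consistency, while large disagreements (dynamic objects) are allowed to update immediately, which is the behavior the learned approach is trained to balance.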
Layered 3D: tomographic image synthesis for attenuation-based light field and high dynamic range displays
We develop tomographic techniques for image synthesis on displays composed of compact volumes of light-attenuating material. Such volumetric attenuators recreate a 4D light field or high-contrast 2D image when illuminated by a uniform backlight. Since arbitrary oblique views may be inconsistent with any single attenuator, iterative tomographic reconstruction minimizes the difference between the emitted and target light fields, subject to physical constraints on attenuation. As multi-layer generalizations of conventional parallax barriers, such displays are shown, both by theory and experiment, to exceed the performance of existing dual-layer architectures. For 3D display, spatial resolution, depth of field, and brightness are increased, compared to parallax barriers. For a plane at a fixed depth, our optimization also allows optimal construction of high dynamic range displays, confirming existing heuristics and providing the first extension to multiple, disjoint layers. We conclude by demonstrating the benefits and limitations of attenuation-based light field displays using an inexpensive fabrication method: separating multiple printed transparencies with acrylic sheets.
Funding: Dolby Laboratories Inc.; Samsung Electronics; Alfred P. Sloan Foundation
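The optimization can be illustrated in a toy two-layer, log-domain form: each ray's log-transmittance is the sum of the log-attenuations of the layer pixels it crosses, and squared error against a target light field is minimized subject to the physical constraint that attenuation cannot exceed 1 (log values at most 0). The paper solves the full 4D problem with iterative tomographic reconstruction; the projected-gradient solver below is our simplification:

```python
import numpy as np

def solve_layers(target_log, n_iter=500, lr=0.5):
    """target_log: (N, N) log-light-field where ray (i, j) passes through
    pixel i of the front layer and pixel j of the rear layer.

    Solves for per-layer log-attenuations a, b <= 0 minimizing
    ||(a[i] + b[j]) - target_log||^2 by projected gradient descent
    (the projection enforces the physical attenuation constraint).
    """
    n = target_log.shape[0]
    a = np.zeros(n)
    b = np.zeros(n)
    for _ in range(n_iter):
        resid = a[:, None] + b[None, :] - target_log
        a = np.minimum(a - lr * resid.mean(axis=1), 0.0)  # project to <= 0
        b = np.minimum(b - lr * resid.mean(axis=0), 0.0)
    return a, b
```

Because a single attenuator cannot reproduce arbitrary oblique views exactly, the general case leaves a nonzero residual; the solver then returns the physically feasible layers closest to the target.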
Single lens off-chip cellphone microscopy
Within the last few years, cellphone subscriptions have spread widely and now cover even the remotest parts of the planet. Adequate access to healthcare, however, is not widely available, especially in developing countries. We propose a new approach to converting cellphones into low-cost scientific devices for microscopy. Cellphone microscopes have the potential to revolutionize health-related screening and analysis for a variety of applications, including blood and water tests. Our optical system is more flexible than previously proposed mobile microscopes and allows for wide field of view panoramic imaging, the acquisition of parallax, and coded background illumination, which optically enhances the contrast of transparent and refractive specimens.
Tensor displays: compressive light field synthesis using multilayer displays with directional backlighting
We introduce tensor displays: a family of compressive light field displays comprising all architectures employing a stack of time-multiplexed, light-attenuating layers illuminated by uniform or directional backlighting (i.e., any low-resolution light field emitter). We show that the light field emitted by an N-layer, M-frame tensor display can be represented by an Nth-order, rank-M tensor. Using this representation we introduce a unified optimization framework, based on nonnegative tensor factorization (NTF), encompassing all tensor display architectures. This framework is the first to allow joint multilayer, multiframe light field decompositions, significantly reducing artifacts observed with prior multilayer-only and multiframe-only decompositions; it is also the first optimization method for designs combining multiple layers with directional backlighting. We verify the benefits and limitations of tensor displays by constructing a prototype using modified LCD panels and a custom integral imaging backlight. Our efficient, GPU-based NTF implementation enables interactive applications. Through simulations and experiments we show that tensor displays reveal practical architectures with greater depths of field, wider fields of view, and thinner form factors, compared to prior automultiscopic displays.
Funding: United States. Defense Advanced Research Projects Agency (DARPA SCENICC program); National Science Foundation (U.S.) (NSF Grant IIS-1116452); United States. Defense Advanced Research Projects Agency (DARPA MOSAIC program); United States. Defense Advanced Research Projects Agency (DARPA Young Faculty Award); Alfred P. Sloan Foundation (Fellowship)
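The rank-M representation can be checked in its simplest case: for two layers and M time-multiplexed frames, the eye averages the frames, so the emitted light field is a sum of M outer products, i.e. a nonnegative matrix of rank at most M. A sketch of that forward model (2-layer special case; variable names are ours):

```python
import numpy as np

def emitted_light_field(front, rear):
    """Two-layer, M-frame time-multiplexed display.

    front: (M, U) transmittances of the front layer per frame.
    rear:  (M, S) transmittances of the rear layer per frame.
    A ray addressed by (u, s) passes front pixel u and rear pixel s, and
    the viewer averages over the M frames:
        L[u, s] = (1/M) * sum_m front[m, u] * rear[m, s]
    This is a rank-M nonnegative matrix, the 2-layer instance of the
    Nth-order, rank-M tensor representation.
    """
    M = front.shape[0]
    return np.einsum('mu,ms->us', front, rear) / M
```

NTF then runs this model in reverse, searching for nonnegative layer patterns whose averaged product best matches a target light field.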
Multisource Holography
Holographic displays promise several benefits including high quality 3D
imagery, accurate accommodation cues, and compact form-factors. However,
holography relies on coherent illumination which can create undesirable speckle
noise in the final image. Although smooth phase holograms can be speckle-free,
their non-uniform eyebox makes them impractical, and speckle mitigation with
partially coherent sources also reduces resolution. Averaging sequential frames
for speckle reduction requires high-speed modulators and consumes temporal
bandwidth that may be needed elsewhere in the system.
In this work, we propose multisource holography, a novel architecture that
uses an array of sources to suppress speckle in a single frame without
sacrificing resolution. By using two spatial light modulators, arranged
sequentially, each source in the array can be controlled almost independently
to create a version of the target content with different speckle. Speckle is
then suppressed when the contributions from the multiple sources are averaged
at the image plane. We introduce an algorithm to calculate multisource
holograms, analyze the design space, and demonstrate up to a 10 dB increase in
peak signal-to-noise ratio compared to an equivalent single source system.
Finally, we validate the concept with a benchtop experimental prototype by
producing both 2D images and focal stacks with natural defocus cues.
Comment: 14 pages, 9 figures, to be published in SIGGRAPH Asia 202
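The statistical principle behind averaging source contributions can be checked numerically: fully developed speckle from one coherent source has intensity contrast near 1, and summing N mutually incoherent, independent speckle patterns lowers the contrast by about 1/sqrt(N). A small Monte-Carlo sketch of those statistics (our illustration, not the paper's two-SLM architecture):

```python
import numpy as np

def speckle_contrast(n_sources, n_pixels=200_000, seed=0):
    """Simulate fully developed speckle: each source contributes a field
    with independent circular-Gaussian statistics per pixel; intensities
    from mutually incoherent sources add. Returns the speckle contrast
    std(I) / mean(I), which is ~1 for one source and falls roughly as
    1/sqrt(N) for N averaged sources."""
    rng = np.random.default_rng(seed)
    I = np.zeros(n_pixels)
    for _ in range(n_sources):
        field = rng.normal(size=n_pixels) + 1j * rng.normal(size=n_pixels)
        I += np.abs(field) ** 2  # incoherent sum of intensities
    return I.std() / I.mean()
```

This is the single-frame averaging the source array provides; the second SLM is what lets each source carry the same target content with independent speckle so the average preserves resolution.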
Content-adaptive parallax barriers: optimizing dual-layer 3D displays using low-rank light field factorization
We optimize automultiscopic displays built by stacking a pair of modified LCD panels. To date, such dual-stacked LCDs have used heuristic parallax barriers for view-dependent imagery: the front LCD shows a fixed array of slits or pinholes, independent of the multi-view content. While prior works adapt the spacing between slits or pinholes, depending on viewer position, we show both layers can also be adapted to the multi-view content, increasing brightness and refresh rate. Unlike conventional barriers, both masks are allowed to exhibit non-binary opacities. It is shown that any 4D light field emitted by a dual-stacked LCD is the tensor product of two 2D masks. Thus, any pair of 1D masks only achieves a rank-1 approximation of a 2D light field. Temporal multiplexing of masks is shown to achieve higher-rank approximations. Non-negative matrix factorization (NMF) minimizes the weighted Euclidean distance between a target light field and that emitted by the display. Simulations and experiments characterize the resulting content-adaptive parallax barriers for low-rank light field approximation.
Funding: National Science Foundation (U.S.) (grant CCF-0729126); National Research Foundation of Korea (grant 2009-352-D00232)
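The low-rank factorization at the heart of this approach can be sketched with the classic Lee-Seung multiplicative updates for the (unweighted) Euclidean NMF objective; the paper uses a weighted variant, and the iteration count and initialization here are our arbitrary choices:

```python
import numpy as np

def nmf_light_field(L, rank, n_iter=500, seed=0, eps=1e-9):
    """Factor a nonnegative light field matrix L (front-pixel x rear-pixel)
    as F @ R with F, R >= 0, where each of the `rank` components is one
    time-multiplexed mask pair. Lee-Seung multiplicative updates for the
    Euclidean objective; nonnegativity is preserved automatically because
    every update is a ratio of nonnegative terms."""
    rng = np.random.default_rng(seed)
    F = rng.random((L.shape[0], rank))
    R = rng.random((rank, L.shape[1]))
    for _ in range(n_iter):
        F *= (L @ R.T) / (F @ R @ R.T + eps)
        R *= (F.T @ L) / (F.T @ F @ R + eps)
    return F, R
```

Setting `rank=1` corresponds to a single static mask pair; larger ranks model temporal multiplexing, matching the abstract's observation that time-multiplexed masks achieve higher-rank approximations.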
Waveguide Holography: Towards True 3D Holographic Glasses
We present a novel near-eye display concept which consists of a waveguide
combiner, a spatial light modulator, and a laser light source. The proposed
system can display true 3D holographic images through a see-through
pupil-replicating waveguide combiner while also providing a large eye-box. By
modeling the coherent light interaction inside of the waveguide combiner, we
demonstrate that the output wavefront from the waveguide can be controlled by
modulating the wavefront of input light using a spatial light modulator. This
new possibility allows combining a holographic display, widely considered the
ultimate 3D display technology, with state-of-the-art pupil-replicating
waveguides, paving the way towards true 3D holographic augmented reality
glasses.
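The controllability claim can be illustrated with a toy linear model: if coherent propagation through the waveguide is represented as a complex matrix T acting on a discretized input wavefront, then finding the SLM pattern that produces a desired output wavefront is a least-squares inverse problem. T here is a stand-in for the paper's coherent waveguide model, which it does not publish in this form:

```python
import numpy as np

def control_output(T, target):
    """Toy wavefront control through a known linear coherent system.

    T:      (N, N) complex matrix modeling propagation through the
            waveguide (output field = T @ input field).
    target: (N,) desired complex output wavefront.
    Returns the least-squares input wavefront (the SLM modulation, up to
    encoding) and the output it actually produces.
    """
    x = np.linalg.pinv(T) @ target
    return x, T @ x
```

When T is well conditioned the achieved output matches the target closely, which is the sense in which modulating the input wavefront controls the output wavefront.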
BiDi screen: a thin, depth-sensing LCD for 3D interaction using light fields
We transform an LCD into a display that supports both 2D multi-touch and unencumbered 3D gestures. Our BiDirectional (BiDi) screen, capable of both image capture and display, is inspired by emerging LCDs that use embedded optical sensors to detect multiple points of contact. Our key contribution is to exploit the spatial light modulation capability of LCDs to allow lensless imaging without interfering with display functionality. We switch between a display mode showing traditional graphics and a capture mode in which the backlight is disabled and the LCD displays a pinhole array or an equivalent tiled-broadband code. A large-format image sensor is placed slightly behind the liquid crystal layer. Together, the image sensor and LCD form a mask-based light field camera, capturing an array of images equivalent to that produced by a camera array spanning the display surface. The recovered multi-view orthographic imagery is used to passively estimate the depth of scene points. Two motivating applications are described: a hybrid touch plus gesture interaction and a light-gun mode for interacting with external light-emitting widgets. We show a working prototype that simulates the image sensor with a camera and diffuser, allowing interaction up to 50 cm in front of a modified 20.1 inch LCD.
Funding: National Science Foundation (U.S.) (Grant CCF-0729126); Alfred P. Sloan Foundation
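Passive depth estimation from the recovered orthographic views reduces to measuring parallax: a feature at distance z from the screen shifts by about z * tan(dtheta) pixels between two views separated by angular baseline dtheta. A 1D sketch of that idea (our simplification of the BiDi screen's multi-view depth estimation; the brute-force shift search is ours):

```python
import numpy as np

def depth_from_shift(view_a, view_b, dtheta, shifts=range(0, 8)):
    """Estimate depth from two 1D orthographic views separated by the
    angular baseline dtheta. The integer shift best aligning view_b to
    view_a gives depth z = shift / tan(dtheta)."""
    errs = [np.sum((np.roll(view_b, -s) - view_a) ** 2) for s in shifts]
    best = list(shifts)[int(np.argmin(errs))]
    return best / np.tan(dtheta)
```

Real systems match 2D patches across many views and handle sub-pixel shifts, but the geometry is the same: larger parallax means a point farther from the screen.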
Perceptual Requirements for World-Locked Rendering in AR and VR
Stereoscopic, head-tracked display systems can show users realistic,
world-locked virtual objects and environments. However, discrepancies between
the rendering pipeline and physical viewing conditions can lead to perceived
instability in the rendered content resulting in reduced realism, immersion,
and, potentially, visually-induced motion sickness. The requirements to achieve
perceptually stable world-locked rendering are unknown due to the challenge of
constructing a wide field of view, distortion-free display with highly accurate
head- and eye-tracking. In this work we introduce new hardware and software
built upon recently introduced hardware and present a system capable of
rendering virtual objects over real-world references without perceivable drift
under such constraints. The platform is used to study acceptable errors in
render camera position for world-locked rendering in augmented and virtual
reality scenarios, where we find an order of magnitude difference in perceptual
sensitivity between them. We conclude by comparing study results with an
analytic model which examines changes to apparent depth and visual heading in
response to camera displacement errors. We identify visual heading as an
important consideration for world-locked rendering alongside depth errors from
incorrect disparity.
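A toy version of the geometric quantity such an analytic model examines: a lateral render-camera displacement dx rotates the apparent direction of a point at distance z by atan(dx / z), so visual heading errors shrink with viewing distance. This small-angle sketch is our own illustration, not the paper's model:

```python
import math

def heading_error_deg(dx, z):
    """Visual heading error, in degrees, induced by a lateral
    render-camera displacement dx (meters) for a world-locked point at
    distance z (meters): the rendered ray rotates by atan(dx / z)."""
    return math.degrees(math.atan2(dx, z))
```

For example, a 1 cm camera-position error for content at 1 m produces roughly half a degree of heading error, while the same error at 2 m produces half as much, consistent with nearby world-locked content being the most demanding case.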
- …